估计和对外部干扰的反应对于二次驾驶的稳健飞行控制至关重要。现有的估计器通常需要针对特定​​的飞行方案或具有大量现实世界数据的培训进行重大调整,以实现令人满意的性能。在本文中,我们提出了一个神经移动范围估计器(Neuromhe),该估计量可以自动调整由神经网络建模并适应不同飞行方案的MHE参数。我们通过将MHE估计值的分析梯度推导出相对于可调参数的分析梯度实现这一目标,从而使MHE无缝嵌入作为神经网络中的无缝嵌入以进行高效学习。最有趣的是,我们证明可以从递归形式的卡尔曼过滤器有效地解决梯度。此外,我们开发了一种基于模型的策略梯度算法,可以直接从轨迹跟踪误差中训练神经元,而无需进行基础真相干扰。通过在各种具有挑战性的飞行中对四摩特的模拟和物理实验,通过模拟和物理实验对神经元的有效性进行了广泛的验证。值得注意的是,NeuroMhe的表现优于最先进的估计器,仅使用2.5%的参数量,力估计误差降低了49.4%。所提出的方法是一般的,可以应用于其他机器人系统的稳健自适应控制。
translated by 谷歌翻译
对外部干扰的估计和反应对于对准轮运动器的鲁棒控制是根本的重要性。现有估计人通常需要大量数据,包括地面真理的大量数据,以实现令人满意的性能。本文提出了一种数据有效的可微分运动地平线估计(DMHE)算法,可以在线自动调整MHE参数,并适应不同的场景。我们通过从MHE相对于调谐参数导出估计的轨迹的分析梯度来实现这一点,使能够进行自动调整的端到端学习。最有趣的是,我们表明可以从递归形式的卡尔曼滤波器有效地计算梯度。此外,我们开发了一种基于模型的策略梯度算法,可以直接从轨迹跟踪误差中学习参数,而无需对实际真理。所提出的DMHE可以进一步嵌入为具有用于联合优化的其他神经网络的层。最后,我们通过在四轮官上的模拟和实验中展示了所提出的方法的有效性,其中检查了突然有效载荷变化和飞行中的具有挑战性的情景。
translated by 谷歌翻译
With the fast development of big data, it has been easier than before to learn the optimal decision rule by updating the decision rule recursively and making online decisions. We study the online statistical inference of model parameters in a contextual bandit framework of sequential decision-making. We propose a general framework for online and adaptive data collection environment that can update decision rules via weighted stochastic gradient descent. We allow different weighting schemes of the stochastic gradient and establish the asymptotic normality of the parameter estimator. Our proposed estimator significantly improves the asymptotic efficiency over the previous averaged SGD approach via inverse probability weights. We also conduct an optimality analysis on the weights in a linear regression setting. We provide a Bahadur representation of the proposed estimator and show that the remainder term in the Bahadur representation entails a slower convergence rate compared to classical SGD due to the adaptive data collection.
translated by 谷歌翻译
Model counting is a fundamental problem which has been influential in many applications, from artificial intelligence to formal verification. Due to the intrinsic hardness of model counting, approximate techniques have been developed to solve real-world instances of model counting. This paper designs a new anytime approach called PartialKC for approximate model counting. The idea is a form of partial knowledge compilation to provide an unbiased estimate of the model count which can converge to the exact count. Our empirical analysis demonstrates that PartialKC achieves significant scalability and accuracy over prior state-of-the-art approximate counters, including satss and STS. Interestingly, the empirical results show that PartialKC reaches convergence for many instances and therefore provides exact model counting performance comparable to state-of-the-art exact counters.
translated by 谷歌翻译
While mislabeled or ambiguously-labeled samples in the training set could negatively affect the performance of deep models, diagnosing the dataset and identifying mislabeled samples helps to improve the generalization power. Training dynamics, i.e., the traces left by iterations of optimization algorithms, have recently been proved to be effective to localize mislabeled samples with hand-crafted features. In this paper, beyond manually designed features, we introduce a novel learning-based solution, leveraging a noise detector, instanced by an LSTM network, which learns to predict whether a sample was mislabeled using the raw training dynamics as input. Specifically, the proposed method trains the noise detector in a supervised manner using the dataset with synthesized label noises and can adapt to various datasets (either naturally or synthesized label-noised) without retraining. We conduct extensive experiments to evaluate the proposed method. We train the noise detector based on the synthesized label-noised CIFAR dataset and test such noise detector on Tiny ImageNet, CUB-200, Caltech-256, WebVision and Clothing1M. Results show that the proposed method precisely detects mislabeled samples on various datasets without further adaptation, and outperforms state-of-the-art methods. Besides, more experiments demonstrate that the mislabel identification can guide a label correction, namely data debugging, providing orthogonal improvements of algorithm-centric state-of-the-art techniques from the data aspect.
translated by 谷歌翻译
Robots are traditionally bounded by a fixed embodiment during their operational lifetime, which limits their ability to adapt to their surroundings. Co-optimizing control and morphology of a robot, however, is often inefficient due to the complex interplay between the controller and morphology. In this paper, we propose a learning-based control method that can inherently take morphology into consideration such that once the control policy is trained in the simulator, it can be easily deployed to robots with different embodiments in the real world. In particular, we present the Embodiment-aware Transformer (EAT), an architecture that casts this control problem as conditional sequence modeling. EAT outputs the optimal actions by leveraging a causally masked Transformer. By conditioning an autoregressive model on the desired robot embodiment, past states, and actions, our EAT model can generate future actions that best fit the current robot embodiment. Experimental results show that EAT can outperform all other alternatives in embodiment-varying tasks, and succeed in an example of real-world evolution tasks: stepping down a stair through updating the morphology alone. We hope that EAT will inspire a new push toward real-world evolution across many domains, where algorithms like EAT can blaze a trail by bridging the field of evolutionary robotics and big data sequence modeling.
translated by 谷歌翻译
Persuasion modeling is a key building block for conversational agents. Existing works in this direction are limited to analyzing textual dialogue corpus. We argue that visual signals also play an important role in understanding human persuasive behaviors. In this paper, we introduce the first multimodal dataset for modeling persuasion behaviors. Our dataset includes 199 dialogue transcriptions and videos captured in a multi-player social deduction game setting, 26,647 utterance level annotations of persuasion strategy, and game level annotations of deduction game outcomes. We provide extensive experiments to show how dialogue context and visual signals benefit persuasion strategy prediction. We also explore the generalization ability of language models for persuasion modeling and the role of persuasion strategies in predicting social deduction game outcomes. Our dataset, code, and models can be found at https://persuasion-deductiongame.socialai-data.org.
translated by 谷歌翻译
Deep reinforcement learning has recently emerged as an appealing alternative for legged locomotion over multiple terrains by training a policy in physical simulation and then transferring it to the real world (i.e., sim-to-real transfer). Despite considerable progress, the capacity and scalability of traditional neural networks are still limited, which may hinder their applications in more complex environments. In contrast, the Transformer architecture has shown its superiority in a wide range of large-scale sequence modeling tasks, including natural language processing and decision-making problems. In this paper, we propose Terrain Transformer (TERT), a high-capacity Transformer model for quadrupedal locomotion control on various terrains. Furthermore, to better leverage Transformer in sim-to-real scenarios, we present a novel two-stage training framework consisting of an offline pretraining stage and an online correction stage, which can naturally integrate Transformer with privileged training. Extensive experiments in simulation demonstrate that TERT outperforms state-of-the-art baselines on different terrains in terms of return, energy consumption and control smoothness. In further real-world validation, TERT successfully traverses nine challenging terrains, including sand pit and stair down, which can not be accomplished by strong baselines.
translated by 谷歌翻译
Graphene quantum dots provide a platform for manipulating electron behaviors in two-dimensional (2D) Dirac materials. Most previous works were of the "forward" type in that the objective was to solve various confinement, transport and scattering problems with given structures that can be generated by, e.g., applying an external electrical field. There are applications such as cloaking or superscattering where the challenging problem of inverse design needs to be solved: finding a quantum-dot structure according to certain desired functional characteristics. A brute-force search of the system configuration based directly on the solutions of the Dirac equation is computational infeasible. We articulate a machine-learning approach to addressing the inverse-design problem where artificial neural networks subject to physical constraints are exploited to replace the rigorous Dirac equation solver. In particular, we focus on the problem of designing a quantum dot structure to generate both cloaking and superscattering in terms of the scattering efficiency as a function of the energy. We construct a physical loss function that enables accurate prediction of the scattering characteristics. We demonstrate that, in the regime of Klein tunneling, the scattering efficiency can be designed to vary over two orders of magnitudes, allowing any scattering curve to be generated from a proper combination of the gate potentials. Our physics-based machine-learning approach can be a powerful design tool for 2D Dirac material-based electronics.
translated by 谷歌翻译
The unsupervised anomaly localization task faces the challenge of missing anomaly sample training, detecting multiple types of anomalies, and dealing with the proportion of the area of multiple anomalies. A separate teacher-student feature imitation network structure and a multi-scale processing strategy combining an image and feature pyramid are proposed to solve these problems. A network module importance search method based on gradient descent optimization is proposed to simplify the network structure. The experimental results show that the proposed algorithm performs better than the feature modeling anomaly localization method on the real industrial product detection dataset in the same period. The multi-scale strategy can effectively improve the effect compared with the benchmark method.
translated by 谷歌翻译